1. Introduction


In recent years, learning scientists have taken a keen interest in drawing links between affect and
learning. Indeed, researchers have documented a dynamic relationship between affect and
learning in which some affective states are antecedents of learning outcomes, while others are
consequences of these outcomes (Guia 2013). In addition to the well-documented impact of
cognition, motivation, discourse, action, and the environment on learning, the assumption that
affect is inextricably bound to learning is supported by various studies (D’Mello 2007).
For example, the four-quadrant model proposed by Kort (2001) explicitly links
learning success or failure to affective states. While attempting to solve complicated tasks or
learn complex material, students may experience many different emotions (i.e., affective
states). They might be highly motivated by curiosity or fascination about a new topic at the
beginning of their task and become puzzled or frustrated by some misconception during further
processing. An affect-receptive learning system would ideally recognize these states and
respond to students’ emotions to increase motivation and improve learning gains.

To automatically identify and process affective states, such a learning system needs to be
fed with appropriate affect information. Passive sensors, which capture the physical state or
behavior of the learner, can be used to gather this information. The affective computing (AC)
literature shows that, because of their less intrusive character, camera- or microphone-based
systems are typically preferred for collecting students’ emotional information. A video camera might
capture facial expressions, body posture, and gestures, while a microphone might capture speech
(Malatesta 2009). However, physiological signals, such as skin temperature and blood volume
pulse, can also be used in AC systems to identify emotions by analyzing the pattern of physiological
changes. The amount of data that such signals can provide is increasing primarily for 2
reasons: 1) measurement precision and data analysis techniques are improving and 2) sensors are becoming
smaller and the use of wearable devices is increasing.

Usually, physiological signals are recorded by equipment and techniques that are more intrusive
than those recording facial and vocal expression. For example, to measure heart rate (HR) or
heart rate variability (HRV), the subject typically needs to be wired up with wet adhesive
electrodes on the skin and cables that are fixed somewhere in the environment. This is often
perceived as uncomfortable and unpleasant. These conventional methods for obtaining
physiological information are still essential for medical as well as research
purposes (Liu 2012). Electrocardiographs (ECGs), which traditionally monitor the electrical
activity of the heart over a period of time by electrodes attached to the surface of the skin, are
used in almost every clinical environment. Pulse oximeters, which measure the blood-oxygen
saturation level and pulse rate by using the light-absorbing characteristics of human blood at
certain wavelengths, are commonly used during surgery. Capnographs monitor respiratory status
and pulse of patients during anesthesia and intensive care. These methods are proven and well
established in the clinical field. Nevertheless, in non-clinical fields, they have some
disadvantages. The main limitation of these conventional methods is their
requirement of maintaining direct contact with the surface of the subject’s skin. This is
especially a disadvantage in learning environments, where such devices may have a negative
impact on the student’s behavior and the learning process because they are undesirable and
inconvenient. Therefore, there is an increasing interest in the development and use of
non-contact methods for measuring physiological information.

One  of  the  most  promising  low-cost,  non-contact,  and  non-intrusive  methods  is  remote 
photoplethysmographic  imaging  (iPPG).  In  this  work,  an  ambient  light  and  webcam-based  iPPG 
method  is  examined  for  its  accuracy,  sensitivity,  and  reliability  in  measuring  human  heartbeats. 
The  following  sections  provide  a  quick  overview  of  the  underlying  technique  of  iPPG. 

1.1  Photoplethysmography 

Detection of the cardiovascular pulse wave traveling through the body is referred to as
plethysmography (plethysmos = increase in Greek) and can be done by means such as variations
in air pressure, impedance, or strain (Verkruysse et al. 2008). Photoplethysmography (PPG) is an
optical measurement technique that can be used to detect blood volume changes in the
microvascular bed of tissue. It has widespread clinical application, with the technology used in
commercially available medical devices, for example, pulse oximeters, vascular diagnostics, and
digital beat-to-beat blood pressure measurement systems (Allen 2007). PPG is based on the
principle that blood absorbs more light than the surrounding tissue, so a change in blood volume
affects transmission or reflectance correspondingly. The observed pulse wave, the PPG signal
(blood volume pulse), is caused by the periodic pulsations of arterial blood within the peripheral
vasculature and is discernible by the dynamic optical absorption that this induces in
well-perfused peripheral tissue (Hayes 1998). A basic form of PPG technology requires only a few
opto-electronic components: a light source to illuminate the tissue (e.g., skin) and a
photodetector to measure the small variations in light intensity associated with changes in
perfusion in the catchment volume (Allen 2007). A typical application of this principle is pulse
oximetry.

1.2  Pulse  Oximetry 

A pulse oximeter monitors the blood-oxygen saturation level and pulse rate
by using the light-absorbing characteristics of oxyhemoglobin (oxygenated hemoglobin) and
deoxyhemoglobin (deoxygenated hemoglobin) at certain wavelengths (i.e., 660 nm red and
940 nm infrared) and the pulsating nature of arterial blood flow. With pulse oximeters, a finger
or earlobe probe is used: a red light-emitting diode (LED) and an infrared LED are located on
one side of the probe, and a photodetector is located on the other side (Mengelkoch 1994). The
photodetector measures the amount of red and infrared light it receives, from which
the amount absorbed is calculated. Oxyhemoglobin absorbs more infrared light and
deoxyhemoglobin more red light. The blood-oxygen saturation level can then be
calculated by comparing the amounts of red and infrared light received. With each
heartbeat, there is a surge of oxygenated blood and a slight increase of the arterial blood volume. This
results in more infrared light absorption during the surge and can be represented as a heartbeat
(Mardirossian 1992).
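
The comparison of red and infrared light described above is commonly expressed as a "ratio of ratios". The following Python sketch illustrates that idea under stated assumptions: the function names are hypothetical, the pulsatile (AC) part is approximated by the peak-to-peak swing, and the linear calibration coefficients (110 and 25) are illustrative textbook-style values, not taken from any device discussed in this report.

```python
import numpy as np

def ratio_of_ratios(red, infrared):
    """Return R = (AC_red / DC_red) / (AC_ir / DC_ir) for two light traces."""
    red = np.asarray(red, dtype=float)
    infrared = np.asarray(infrared, dtype=float)
    # DC: mean light level; AC: peak-to-peak pulsatile swing (an approximation).
    ac_red, dc_red = np.ptp(red), red.mean()
    ac_ir, dc_ir = np.ptp(infrared), infrared.mean()
    return (ac_red / dc_red) / (ac_ir / dc_ir)

def estimate_spo2(red, infrared, a=110.0, b=25.0):
    """Illustrative linear calibration SpO2 ~ a - b * R (assumed coefficients)."""
    return a - b * ratio_of_ratios(red, infrared)

# Synthetic 10 s traces with a 1.2 Hz pulse; amplitudes are made up.
t = np.linspace(0.0, 10.0, 1000)
pulse = np.sin(2 * np.pi * 1.2 * t)
red_trace = 1.0 + 0.01 * pulse
ir_trace = 1.0 + 0.02 * pulse
spo2_estimate = estimate_spo2(red_trace, ir_trace)
```

Real devices rely on empirically determined calibration curves, so the numbers produced by this sketch only illustrate the principle.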

1.3  Photoplethysmographic  Imaging 

The  requirement  that  PPG  methods  must  maintain  direct  contact  with  the  skin  of  the  subject 
limits  the  use  of  pulse  oximeters  in  certain  areas,  for  example,  where  special  emphasis  is  placed 
on  mobility  or  skin  contact  is  inappropriate.  However,  this  problem  can  be  solved  by  using 
remote,  non-contact  pulse  oximetry  and  iPPG  methods.  Such  methods  have  been  increasingly 
studied  in  recent  years. 

In 2007, Kenneth Humphreys and Tomas Ward presented a camera-based device capable of
capturing two PPG signals at two different wavelengths simultaneously, in a remote non-contact
manner. The system comprises a complementary metal-oxide semiconductor (CMOS) camera
and a dual-wavelength array of LEDs (760 and 880 nm). The camera was positioned and focused
on an area of the volar side of the forearm close to the wrist. The experiment yielded very
accurate HR measurements; however, it was still dependent on the use of
special LEDs. In 2008, Verkruysse et al. measured PPG signals remotely (>1 m) using ambient light instead
of special LEDs. They used a simple consumer-level digital photo camera in movie mode and
showed that PPG signals can be remotely measured on the human face without dedicated red and
infrared light sources. Their results also showed that ambient light images may be useful for
medical purposes such as characterization of vascular skin lesions (e.g., port wine stains) and
remote sensing of vital signs (e.g., heart and respiration rates) for triage or sports purposes. Poh
et al. implemented an advanced algorithm in 2010 and showed simultaneous assessment of
multiple people’s heartbeats with a basic laptop-embedded webcam.

So far, studies in this area have involved either high-performance cameras or
webcam-based systems. In 2012, Sun et al. examined the quality of the physiological information that
could be acquired with these types of camera systems by comparing it to the data that a
commercial pulse oximeter sensor can provide. The outcome of their experiments shows that the
results of both the high-performance camera and the webcam-based system coincide to a large
extent with the results of the conventional pulse oximeter.


2.  Methodology 


All of the previously presented methods are based on substantially the same procedure. To
monitor physiological information, a region of interest (ROI) on the forehead or wrist of the
subject is recorded with a camera under a dedicated light source or ambient light. The
subject is asked to keep the examined ROI still to prevent motion artifacts. For
post-experiment analysis, the video stream is stored in an uncompressed AVI format on a
personal computer (PC). Afterwards, the ROI is isolated from the video data and decomposed
into the red, green, and blue (RGB) channels using the Matlab software (The MathWorks Inc.)
and the Open Computer Vision (OpenCV) library. The color data of each channel are then
spatially averaged to obtain a reference color value for each frame. For further analysis,
however, only the green channel was used in previous studies. This is because hemoglobin
absorbs green light more strongly than red or blue light. Former
studies have shown that the green channel yields the strongest cardiac pulse signal
(Verkruysse et al. 2008). The average green value of each frame provides one sample of the
signal. Applying a fast Fourier transform (FFT) to this signal then allows the pulse signal
to be isolated.
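
The offline procedure described above (spatial averaging of the green channel followed by an FFT) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code of any cited study: the function names, the RGB channel order, the ROI layout, and the 0.75 to 4 Hz search band (45 to 240 bpm) are all hypothetical choices.

```python
import numpy as np

def green_trace(frames, roi):
    """Spatially average the green channel inside roi for every frame.

    frames: array of shape (n_frames, height, width, 3), assumed RGB order.
    roi: (top, bottom, left, right) pixel bounds (hypothetical layout).
    """
    top, bottom, left, right = roi
    return frames[:, top:bottom, left:right, 1].mean(axis=(1, 2))

def dominant_rate_bpm(trace, fps, low_hz=0.75, high_hz=4.0):
    """Return the strongest spectral peak inside the assumed pulse band, in bpm."""
    trace = np.asarray(trace, dtype=float)
    trace = trace - trace.mean()                     # remove the DC offset
    spectrum = np.abs(np.fft.rfft(trace))            # magnitude spectrum
    freqs = np.fft.rfftfreq(trace.size, d=1.0 / fps) # bin frequencies in Hz
    band = (freqs >= low_hz) & (freqs <= high_hz)
    return 60.0 * freqs[band][np.argmax(spectrum[band])]

# Synthetic example: 10 s of 8x8 frames at 30 fps with a 1.2 Hz green ripple.
fps = 30.0
t = np.arange(300) / fps
frames = np.full((300, 8, 8, 3), 100.0)
frames[..., 1] += 0.5 * np.sin(2 * np.pi * 1.2 * t)[:, None, None]
estimated_bpm = dominant_rate_bpm(green_trace(frames, (2, 6, 2, 6)), fps)
```

With a 10 s window the frequency resolution is 0.1 Hz (6 bpm), which shows why longer recordings or interpolation are needed for finer HR estimates.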

In contrast to the previously presented offline post-processing methods, in this work, a fully
automatic live-processing method was applied and tested for its reliability and accuracy. For
this purpose, a Python application (webcam-pulse-detector) that detects and highlights the HR of
an individual in real time was used. The code for this application was developed and
implemented by Tristan Hearn at the National Aeronautics and Space Administration (NASA)
Glenn Research Center in support of OpenMDAO, under the Aeronautical Sciences Project in
NASA’s Fundamental Aeronautics Program, as well as the Crew State Monitoring Element of
the Vehicle Systems Safety Technologies Project, in NASA’s Aviation Safety Program (Hearn
2013). The HR detection algorithm used is inspired by the recent work on Eulerian video
magnification by Wu et al. (2012). The application uses the OpenCV library to find the location
of the user’s face and isolate the forehead region. By applying spatial decomposition followed by
temporal filtering to the frames and then amplifying the resulting signal, the flow of blood as it
fills the face can be visualized and an HR can be calculated in real time.
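
To make the "isolate the forehead region" step more concrete, the following Python sketch derives a forehead patch from a face bounding box of the kind a face detector returns. It is only an illustration: the function name and the relative position and size fractions are assumptions chosen for this example and are not taken from the description of the application above.

```python
def forehead_roi(face_box, rel_x=0.5, rel_y=0.18, rel_w=0.25, rel_h=0.15):
    """Return (x, y, w, h) of a forehead patch inside a face box.

    face_box is (x, y, w, h) in pixels, as produced by a typical face detector.
    The rel_* fractions are illustrative guesses: the patch is centered at
    (rel_x, rel_y) of the face box and spans rel_w by rel_h of its size.
    """
    x, y, w, h = face_box
    roi_w, roi_h = int(w * rel_w), int(h * rel_h)
    center_x = x + w * rel_x
    center_y = y + h * rel_y
    return (int(center_x - roi_w / 2), int(center_y - roi_h / 2), roi_w, roi_h)

# Example with a hypothetical 200x200 px face detected at (100, 50).
fx, fy, fw, fh = forehead_roi((100, 50, 200, 200))
```

Because the patch is defined relative to the detected face, it follows the face when automatic tracking is on, whereas a locked patch stays at fixed image coordinates.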

The objective of this study is to test the robustness and accuracy of this algorithm under different
conditions, with an ECG serving as the gold standard.

In addition to the overall accuracy, the study investigates how varying ambient light,
the distance between the subject and the camera, and movements of
the subject influence the measurement results.

2.1  Materials  and  Setup 

As a sensor, a simple, low-cost 2-megapixel webcam from Logitech (Logitech 2 MP HD
Webcam C600) was connected via universal serial bus (USB) to a standard laptop
computer. The default Microsoft webcam drivers were used. The camera was mounted centrally
on the laptop lid and directed toward the face of the subject. The computer used was an
Alienware M17x R3 with an Intel® Core™ i7 2630QM processor and 4 GB of RAM. On this
system, a Python 2.7 environment with the OpenCV library was provided for Hearn’s
webcam-pulse-detector application. The source code published by Hearn was only
minimally modified for the purpose of logging.

As a reference system and the gold standard, an MP-150 BIOPAC data acquisition system in
combination with an ECG amplifier module (ECG100C) was used. The ECG100C is a
single-channel, high-gain, differential-input biopotential amplifier designed specifically for monitoring
the heart’s electrical activity. HR data were recorded, processed, and analyzed with the BIOPAC
software AcqKnowledge.

Four subjects participated in this study. The group comprised 3 males and 1 female, aged
28-37 years, of Caucasian and African-American ethnicity.
The experiments were conducted indoors under relatively controlled lighting
conditions. To simulate good light conditions, the blinds of the room remained open and a
large amount of daylight (1200 lx) flooded the room. Nevertheless, the illuminance of ambient
light on the face of the subject varied between 800 and 1200 lx, depending on whether the sun
was obscured by a cloud. To simulate low-light conditions, the blinds
were closed. The illuminance of the ambient light on the subject then fell to a range
between 10 and 50 lx. The illuminances were checked before each experiment with a standard
lux meter of the kind used for measuring illuminance in workplaces.

2.2  Experimental  Protocol 

Before starting the experiment, the participants were informed about the experimental setup and
procedure, and wired to the ECG module. Three BIOPAC EL503 pre-gelled, medium-adhesive
electrodes were used to attach the wires. The electrodes were fixed in Einthoven’s
triangle formation to ankles and wrist (3-lead ECG). The participants were asked to sit in front of
the camera and minimize movement at the points at which the electrodes had been mounted.

Each session with a participant comprised 16 trials (or runs), and each run lasted 5 min. One
run was performed for each parameter combination. In this study, the impact of 4
parameters (lighting, motion, distance, and face lock) was examined. For the lighting
parameter, the room was either darkened by closing the blinds or illuminated with daylight
through open blinds in combination with normal artificial fluorescent light from the ceiling. For
the distance parameter, the participants were asked to position themselves either very close to
(1.5 ft) or farther from (3 ft) the front of the camera. For the motion parameter, participants
were asked to remain as still as possible or to simulate a natural movement pattern in front of the
screen. For the latter, the participants were asked to move their head in smooth and slow
movements within a maximum angle of 45° in any direction. The requested movements included
nodding and tilting the head, looking up/down and at the left/right corners of their computer
screen, and making spontaneous facial expressions. With the motion parameter, the goal was to
see whether the algorithm was able to compensate for simple and slow movements or whether these
slight movements led to inaccuracies in the measurement. The face lock parameter tested
whether a fixed focus on a specific forehead position resulted in better measurement data than
the automatic face tracking option of the pulse detector application.
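
The 16 runs follow from crossing the 4 two-level parameters. As a hedged illustration (the parameter names and the 0/1 coding are labels chosen for this sketch), the full design and the resulting recording time can be enumerated in Python:

```python
from itertools import product

# Labels are illustrative; 0 = no/off, 1 = yes/on for each parameter.
PARAMETERS = ("light", "distance", "movement", "face_lock")

def build_runs():
    """Enumerate every 0/1 combination of the four parameters (2**4 = 16)."""
    return [dict(zip(PARAMETERS, levels)) for levels in product((0, 1), repeat=4)]

runs = build_runs()
minutes_per_participant = len(runs) * 5        # 16 runs of 5 min each
total_minutes = minutes_per_participant * 4    # four participants
```

`product` varies the last parameter fastest, so run 1 has all parameters at 0 and run 16 has all at 1, which is the same ordering used for the experiments in Table 1.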


3.  Results 


A total of 16 sets of measurements were taken from each participant. The results were
summarized over all participants and visualized in Bland-Altman plots (Bland 1986) to
graphically evaluate the average discrepancy (bias) between the webcam and ECG methods. The
differences between the two measurements at a given time were plotted against the average
of the two measurements. Figure 1 shows 2 of the 16 evaluated plots, and
Table 1 provides an overview of the results of all 16 experiments.
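
As a minimal sketch of the Bland-Altman quantities used here, the following Python function computes the bias and limits of agreement from paired HR readings. The function name, the sign convention (webcam minus ECG), and the sample values are assumptions for illustration; the multiplier k is left as a parameter because the text refers to ±1.96 while the column headers of Table 1 use 2 SD.

```python
import numpy as np

def bland_altman(webcam_bpm, ecg_bpm, k=1.96):
    """Return the Bland-Altman statistics for paired HR readings.

    Differences are taken as webcam minus ECG (an assumed sign convention).
    """
    webcam = np.asarray(webcam_bpm, dtype=float)
    ecg = np.asarray(ecg_bpm, dtype=float)
    diff = webcam - ecg                  # plotted on the y-axis
    mean = (webcam + ecg) / 2.0          # plotted on the x-axis
    bias = diff.mean()                   # average discrepancy
    sd = diff.std(ddof=1)                # sample standard deviation
    return {
        "mean": mean,
        "diff": diff,
        "bias": bias,
        "sd": sd,
        "lower": bias - k * sd,
        "upper": bias + k * sd,
    }

# Made-up paired readings in bpm, for illustration only.
stats = bland_altman([71, 73, 75, 77], [70, 74, 74, 78])
```

A bias near zero with wide limits means the two methods agree on average while individual readings can still differ noticeably.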


Fig. 1 Bland-Altman plots (x-axis: average of the HR measurements of the webcam and ECG methods; y-axis:
difference between both methods). The left plot shows the results under the no-light, distance,
no-movement, and no-face-lock condition (experiment no. 5). The right plot shows the no-light,
no-distance, movement, and face-lock condition (experiment no. 4).


Table 1 Overview of the results of all 16 experiments

No.  L  D  M  FL  Mean Difference      SD  Mean Diff. + 2 SD  Mean Diff. - 2 SD
  1  0  0  0   0           0.1205  2.8370             5.7946            -5.5536
  2  0  0  0   1          -0.4557  2.5735             4.6913            -5.6026
  3  0  0  1   0          -1.3316  7.8826            14.4336           -17.0968
  4  0  0  1   1          -3.3564  3.8750             4.3936           -11.1064
  5  0  1  0   0           0.1973  2.2827             4.7627            -4.3681
  6  0  1  0   1           0.2418  2.2728             4.7875            -4.3038
  7  0  1  1   0           0.5204  8.4855            17.4914           -16.4507
  8  0  1  1   1          -2.5915  6.3091            10.0267           -15.2097
  9  1  0  0   0           0.1351  2.8224             5.7799            -5.5097
 10  1  0  0   1          -0.0766  2.7694             5.4623            -5.6154
 11  1  0  1   0         -15.5239  9.9197             4.3155           -35.3634
 12  1  0  1   1          -4.3981  6.4414             8.4847           -17.2810
 13  1  1  0   0          -0.2481  3.6735             7.0990            -7.5951
 14  1  1  0   1           0.2211  2.9463             6.1138            -5.6715
 15  1  1  1   0          -4.2695  7.0245             9.7794           -18.3185
 16  1  1  1   1          -7.2056  8.9417            10.6778           -25.0890

Note: L = light, D = distance, M = movement, FL = face lock; 0 = no, 1 = yes

In the left plot of Fig. 1, experiment no. 5 (see Table 1) is visualized. The conditions for this
experiment were very low ambient lighting (no ceiling light and window blinds
closed), a distance of 3 ft between the subject and the camera, movements avoided as far as
possible, and the face lock disabled. The mean and the standard deviation of the differences
were determined, and the 95% limits of agreement (±1.96 SD) were calculated. The plot on the left
shows that, for this experiment under the given conditions, 95% of the measured differences of
the tested method are no larger than 4.8 bpm. The average bias is only
0.19 bpm. In contrast, the bias is 3.35 bpm for experiment no. 4 (right plot) and the deviation can
be as high as 11.1 bpm. The main difference between these two experiments is the movement
parameter. In experiment no. 4, (natural) movement was requested to a certain extent, whereas in
experiment no. 5, movements were suppressed as far as possible.
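
The figures quoted for these two experiments can be checked against the mean difference and SD values listed in Table 1. The short Python sketch below does so with a hypothetical helper function and the 2 SD multiplier named in the table's column headers. Treating the mean difference of experiment no. 4 as negative is an assumption made here, because only then do the limits match the remaining table columns.

```python
def limits_of_agreement(bias, sd, k=2.0):
    """Return (lower, upper) = bias -/+ k * sd, in bpm."""
    return bias - k * sd, bias + k * sd

# Experiment no. 5: mean difference 0.1973 bpm, SD 2.2827 bpm.
lower5, upper5 = limits_of_agreement(0.1973, 2.2827)

# Experiment no. 4: magnitude 3.3564 bpm, SD 3.8750 bpm; negative sign assumed.
lower4, upper4 = limits_of_agreement(-3.3564, 3.8750)

print(round(lower5, 4), round(upper5, 4))
print(round(lower4, 4), round(upper4, 4))
```

The upper limit of about 4.8 bpm for experiment no. 5 and the lower limit of about 11.1 bpm in magnitude for experiment no. 4 correspond to the values cited in the text.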

Table 1 provides an overview of all runs, summarized over all participants. It can be clearly seen that
movement is the factor that most influenced the accuracy of the results. Movements by the
subject during data recording always resulted in large deviations. Further details of the results are
discussed in Section 4.


4.  Discussion 


In this study, an automatic live-processing method for non-contact, webcam-based HR detection
was applied and tested under varying conditions of movement, distance, and ambient
light. All these conditions were tested with and without locking the application to focus only on
the forehead area of the participants. To cover all combinations of these parameters, 16
experiments were performed and analyzed. For each permutation, participant data were recorded
for a duration of 5 min. Thus, a total of 320 min of data were collected, analyzed, and visualized.
The approach was to compare an established and well-proven clinical HR monitoring system
with a recently implemented HR detection algorithm inspired by the work on Eulerian video
magnification (Wu 2012). The results, summarized in Table 1, show that the reliability of the
results depends strongly on the environmental conditions. As can be seen from Table 1,
movement has the strongest influence on the measurement result. If the ROI is nearly free of
movement, the discrepancy between the 2 methods is relatively low. The mean bias d in these
cases was always lower than 0.46 bpm and under optimal conditions was as low as 0.076 bpm.
This indicates that, on average, the 2 methods produce nearly the same results.

However, the 95% limits of agreement in these cases show that a deviation of up to 7.5 bpm is
likely even under good conditions. The mean deviation under the no-movement condition is
5.57 bpm. Under the in-motion condition, the results are burdened with large deviations and are
ambiguous; the tested algorithm mostly failed to deliver a reliable measurement when movement
was present. The mean bias was always high in these cases and deviation within the limits of
agreement could be as high as 30 bpm.

It can also be seen that the face-lock condition does not dramatically improve the accuracy of the
result. This probably stems from the fact that even while trying to remain motionless,
small uncontrollable movements of the head always affect the measurement. While the ROI is
fixed in the image, the participant shifts the imaged area through uncontrolled movements. This leads
to motion artifacts that influence the image analysis.

Compared with the impact of movement, varying the ambient light had only a small influence on
the measurement results of this study. Neglecting those influenced by movement, we obtained an
average bias d of 0.22 bpm for the measurements in the darkened environment and d = 0.18 bpm
in daylight.

Hence, the results of the webcam method tend to be in agreement with the results of the ECG
method. However, within the 95% limits of agreement, this study shows better results in the
darkened environment (4.77 bpm) than in daylight (5.72 bpm). The same trend can also be seen
even when the less-precise data in the measurement under motion of the subject are included:
5.69 bpm in daylight compared with 7.34 bpm in the darkened environment. In principle, better
results would be expected in good rather than in poor light conditions. The results
contrary to this expectation are probably due to the special camera technology. The camera model
used employs the so-called RightLight 2 technology, which automatically adjusts the exposure of
the image to compensate for dim or poorly lit settings. With this technology, faces stand out from
the background more clearly, so that the algorithm was able to identify and focus on the ROI
faster and more clearly.


5.  Conclusions 


The abilities of a low-cost, non-contact, webcam-based PPG imaging method for determining
heart rate were investigated under various environmental conditions. It has been shown that
movements of the subject have a greater influence on the accuracy of the measurement
results than the other parameters. However, if better methods for motion tracking were used, these
inaccuracies would likely be reduced. A modification of the algorithm that implements a
motion-tracking ability should be considered for future studies. If movement can be
suppressed, relatively good results can be obtained with the examined method, both in daylight
and under low-light conditions. The deviations between the reference system and the
webcam-based method in these cases were, on average, less than 6 bpm. Furthermore, the effect of
varying ambient light was tested and was found to have no significant influence on the accuracy
of the result. Nevertheless, one must consider that the camera used here automatically enhanced
the exposure of the face of the subject when the ambient lighting decreased.

Even under the best conditions, the accuracy of the iPPG method is not as good as that of the
reference ECG device; nevertheless, its non-contact, non-intrusive, and low-cost characteristics still provide a
great advantage. If a good estimate of HR is sufficient and the trend is more
interesting than the exact value at a given time, the iPPG method could provide adequate results.
Taking this into account, this method could be very useful in adaptive learning
environments. The affective states of the learners could be detected in a wireless manner that
would not interfere with the learning process. Even though this study merely examined the
measurement of heart rates, it should be kept in mind that by extending the algorithm,
parameters such as respiratory rate, HRV, and arterial blood oxygen saturation could be
measured as well.

List of Symbols, Abbreviations, and Acronyms

AC        affective computing
ARL       US Army Research Laboratory
CMOS      complementary metal-oxide semiconductor
ECG       electrocardiograph
ESEP      Engineer and Scientists Exchange Program
FFT       fast Fourier transform
HR        heart rate
HRED      Human Research and Engineering Directorate
HRV       heart rate variability
iPPG      photoplethysmographic imaging
LED       light-emitting diode
LITE      Learning in Intelligent Tutoring Environments
NASA      National Aeronautics and Space Administration
OpenCV    Open Computer Vision
PC        personal computer
PPG       photoplethysmography
RGB       red, green, blue
STTC      Simulation and Training Technology Center
USB       universal serial bus